Suboptimality of penalties proportional to the dimension for model selection in heteroscedastic regression

Author

  • Sylvain Arlot

Abstract

We consider the problem of choosing between several models in least-squares regression with heteroscedastic data. We prove that any penalization procedure is suboptimal when the penalty is proportional to the dimension of the model, at least for some typical heteroscedastic model selection problems. In particular, Mallows’ Cp is suboptimal in this framework, as well as any “linear” penalty depending on both the data and their true distribution. On the contrary, optimal model selection is possible in this framework with data-driven penalties such as V-fold or resampling penalties (Arlot, 2008a,b). Therefore, estimating the “shape” of the penalty from the data is useful, even at the price of a higher computational cost. AMS 2000 subject classifications: Primary 62G08; secondary 62G05, 62J05.


Related articles

Choosing a penalty for model selection in heteroscedastic regression

Penalization is a classical approach to model selection. In short, penalization chooses the model minimizing the sum of the empirical risk (how well the model fits the data) and some measure of the model’s complexity, called the penalty; see FPE [1], AIC [2], Mallows’ Cp or CL [22]. A large literature exists on penalties proportional to the dimension of the model in regression, showing...
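The dimension-proportional penalty described above can be made concrete with a minimal sketch of Mallows’-Cp-style selection (the function name is ours, and the noise variance is assumed known, as in the classical Cp setting):

```python
import numpy as np

def cp_select(X_list, y, sigma2):
    """Pick the model minimizing empirical risk + Mallows' Cp penalty.

    X_list: candidate design matrices, one per model;
    sigma2: the (assumed known) noise variance.
    The penalty 2 * sigma2 * D / n is proportional to the dimension D.
    """
    n = len(y)
    best, best_crit = None, np.inf
    for m, X in enumerate(X_list):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        risk = np.mean((y - X @ beta) ** 2)    # empirical risk of the fit
        penalty = 2 * sigma2 * X.shape[1] / n  # linear in the dimension
        crit = risk + penalty
        if crit < best_crit:
            best, best_crit = m, crit
    return best
```

On exactly quadratic data with polynomial models of increasing degree, the criterion picks the degree-2 model: higher degrees fit no better but pay a larger dimension-proportional penalty.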


Optimal model selection for stationary data under various mixing conditions

The history of statistical model selection goes back at least to Akaike [Aka70], [Aka73] and Mallows [Mal73]. They proposed to select among a collection of parametric models the one which minimizes an empirical loss plus some penalty term proportional to the dimension of the models. Birgé & Massart [BM97] and Barron, Birgé & Massart [BBM99] generalize this approach, making the link between mode...


Penalized Bregman Divergence Estimation via Coordinate Descent

Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron et al. (2004) introduced the LARS algorithm. More recently, the coordinate descent (CD) algorithm was developed by Friedman et al. (2007) for penalized linear regression and penalized logistic regression and was shown to be computationally superior. This paper explores...
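The cyclic coordinate-descent idea can be illustrated on the lasso (squared loss plus an L1 penalty); this is an illustrative sketch, not the code of the cited papers, and it assumes all columns of the design matrix are nonzero:

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for
        (1 / 2n) * ||y - X b||^2 + lam * ||b||_1.
    Each coordinate update is an exact one-dimensional minimization,
    which is what makes CD cheap for penalized regression.
    Assumes every column of X is nonzero.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual: remove coordinate j's contribution
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r / n
            beta[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j] / n)
    return beta
```

On an orthogonal design the coordinates decouple and the result is plain soft-thresholding of the least-squares solution, which makes the update easy to check by hand.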


Adaptive-LASSO for Cox’s Proportional Hazards Model

We investigate the variable selection problem for Cox’s proportional hazards model, and propose a unified model selection and estimation procedure with desired theoretical properties and computational convenience. The new method is based on a penalized log partial likelihood with the adaptively-weighted L1 penalty on regression coefficients, and is named adaptive-LASSO (ALASSO) estimator. Inste...
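The adaptive weighting behind ALASSO can be sketched in a toy orthonormal-design regression (not the Cox partial-likelihood setting of the paper); the function name, the design, and the weight formula w_j = 1/|initial estimate|^gamma are illustrative assumptions:

```python
import numpy as np

def adaptive_lasso_orthonormal(y, lam, gamma=1.0):
    """Adaptive L1 estimator on an orthonormal design (toy illustration).

    Coefficients with large initial (here, OLS = y) estimates get small
    weights and are shrunk little; small ones get large weights and are
    set to zero, which is what enables consistent variable selection.
    """
    beta_init = y.copy()  # OLS estimate when the design is the identity
    w = 1.0 / np.maximum(np.abs(beta_init), 1e-12) ** gamma
    # closed form: coordinatewise soft-thresholding at level lam * w_j
    return np.sign(y) * np.maximum(np.abs(y) - lam * w, 0.0)
```

For instance, a large coefficient (3.0) survives nearly unshrunk while a small one (0.1) is thresholded to exactly zero, unlike the plain lasso, which shrinks both by the same amount.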


V-fold cross-validation improved: V-fold penalization

We study the efficiency of V-fold cross-validation (VFCV) for model selection from the non-asymptotic viewpoint, and suggest an improvement on it, which we call “V-fold penalization”. Considering a particular (though simple) regression problem, we prove that VFCV with a bounded V is suboptimal for model selection, because it “overpenalizes”, all the more so as V is large. Hence, asymptotic opti...
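The VFCV risk estimate that this comparison is built on can be sketched as follows (a minimal generic implementation; the function names and the contiguous-fold split are our assumptions, not the paper’s):

```python
import numpy as np

def vfold_cv_risk(fit, predict, X, y, V=5):
    """Estimate the prediction risk of one model by V-fold cross-validation:
    hold out each fold in turn, train on the remaining folds, and average
    the held-out squared errors. Model selection then picks the model with
    the smallest estimated risk."""
    n = len(y)
    folds = np.array_split(np.arange(n), V)  # V contiguous folds
    risks = []
    for held in folds:
        train = np.setdiff1d(np.arange(n), held)
        model = fit(X[train], y[train])
        residuals = y[held] - predict(model, X[held])
        risks.append(np.mean(residuals ** 2))
    return np.mean(risks)
```

With least-squares fitting on noiseless linear data, the correctly specified model gets an essentially zero estimated risk, while a misspecified (intercept-only) model gets a strictly larger one.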




Publication date: 2008